# Low-resource NLP

- **NeuroBERT-Mini GGUF** (mradermacher) · MIT · 219 downloads · 2 likes
  Statically quantized version of boltuix/NeuroBERT-Mini, optimized for edge devices.
  *Tags: Large Language Model, Transformers*
- **SimpleStories 30M** (SimpleStories) · MIT · 735 downloads · 1 like
  SimpleStories is a micro model family designed for interpretability research, trained on the SimpleStories dataset and focused on story generation.
  *Tags: Text Generation, Safetensors, English*
- **Fewshot XSum BART** (bhargavis) · MIT · 19 downloads · 1 like
  A few-shot summarization model based on BART-large, trained on 100 samples from the XSUM dataset to demonstrate the potential of few-shot learning for summarization.
  *Tags: Text Generation*
- **Tweety 7B Tatar v24a** (Tweeties) · Apache-2.0 · 37 downloads · 11 likes
  A trans-tokenized large language model developed for the Tatar language, converted from Mistral-7B-Instruct-v0.2.
  *Tags: Large Language Model, Transformers, Other*
- **TiRoBERTa Abusiveness Detection** (fgaim) · 210 downloads · 2 likes
  A Tigrinya abusive-content detection model fine-tuned from TiRoBERTa on a dataset of 13,717 YouTube comments.
  *Tags: Text Classification, Transformers*
- **Website Classification** (alimazhar-110) · Apache-2.0 · 3,844 downloads · 37 likes
  A DistilBERT-based website classification model, fine-tuned to 95.04% accuracy on an unspecified dataset.
  *Tags: Text Classification, Transformers*
- **AfroLM Active Learning** (bonadossou) · 132 downloads · 8 likes
  AfroLM is a language model pretrained on 23 African languages, employing an active-learning framework to achieve high performance with minimal data.
  *Tags: Large Language Model, Transformers, Other*
- **BanglaBERT Finetuned SQuAD** (Naimul) · 15 downloads · 0 likes
  A fine-tuned version of BanglaBERT on the Bengali SQuAD dataset for question answering.
  *Tags: Question Answering, Transformers*
- **MyanBERTa** (UCSYNLP) · Apache-2.0 · 91 downloads · 4 likes
  MyanBERTa is a Burmese pre-trained language model based on the BERT architecture, pre-trained on a Burmese corpus of 5,992,299 sentences.
  *Tags: Large Language Model, Transformers, Other*
- **RoBERTa Base 10M 1** (nyu-mll) · 13 downloads · 1 like
  RoBERTa models pretrained on datasets of varying scale (1M–1B tokens), in BASE and MED-SMALL configurations.
  *Tags: Large Language Model*
- **ALBERT Large V2 Finetuned RTE** (anirudh21) · Apache-2.0 · 22 downloads · 0 likes
  A text classification model fine-tuned from ALBERT-large-v2 on the GLUE RTE task, for recognizing textual entailment.
  *Tags: Text Classification, Transformers*
- **Tiny RoBERTa Indonesia** (akahana) · MIT · 17 downloads · 1 like
  A small Indonesian RoBERTa model, optimized for Indonesian text-processing tasks.
  *Tags: Large Language Model, Transformers, Other*
- **RoBERTa Base 100M 1** (nyu-mll) · 63 downloads · 0 likes
  A RoBERTa base model pre-trained on 100M tokens, with a validation perplexity of 3.93, suitable for English text processing.
  *Tags: Large Language Model*
- **RoBERTa Base 100M 3** (nyu-mll) · 18 downloads · 0 likes
  RoBERTa variants pre-trained on datasets ranging from 1M to 1B tokens, in BASE and MED-SMALL configurations, suited to natural language processing in resource-limited scenarios.
  *Tags: Large Language Model*
- **ELECTRA Large Generator** (google) · Apache-2.0 · 473 downloads · 8 likes
  ELECTRA is an efficient self-supervised language-representation learning method that replaces traditional generative pretraining with discriminative pretraining, significantly improving computational efficiency.
  *Tags: Large Language Model, English*
- **IndicBART** (ai4bharat) · 4,120 downloads · 33 likes
  IndicBART is a multilingual sequence-to-sequence pre-trained model built on the mBART architecture, focused on Indian languages and supporting 11 Indian languages plus English.
  *Tags: Large Language Model, Transformers, Other*
© 2025 AIbase